Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
Lu, Miao, Sun, Weiwei, Du, Weihua, Ling, Zhan, Yao, Xuesong, Liu, Kang, Chen, Jiecao
We study reinforcement learning (RL) fine-tuning of large language model (LLM) agents for long-horizon multi-turn tool use, where context length quickly becomes a fundamental bottleneck. Existing RL pipelines can suffer from degraded instruction following, excessive rollout costs, and, most importantly, strict context limits. To address these challenges, we introduce summarization-based context management into training. Specifically, the method periodically compresses the tool-use history into LLM-generated summaries that retain task-relevant information, keeping the working context compact while enabling the agent to scale beyond the fixed context window. Building on this formulation, we derive a policy gradient representation that allows standard LLM RL infrastructures to optimize both tool-use behaviors and summarization strategies in an end-to-end fashion. We instantiate this framework with \underline{SU}mmarization augmented \underline{P}olicy \underline{O}ptimization (\texttt{SUPO}), an LLM RL algorithm that enables long-horizon training beyond a fixed context limit. Experiments on interactive function-calling and search tasks demonstrate that \texttt{SUPO} significantly improves the success rate while maintaining the same or even lower working context length compared to baselines. We also show that for complex search tasks, \texttt{SUPO} can further improve evaluation performance when the maximum number of summarization rounds at test time is scaled beyond that used during training. Our results establish summarization-based context management as a principled and scalable approach for training RL agents beyond a fixed context length limit.
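The abstract describes the mechanism only at a high level; the sketch below illustrates one plausible reading of summarization-based context management in a multi-turn tool-use loop. The `llm_generate` and `call_tool` callables, the message schema, and the character budget are illustrative assumptions, not details from the paper.

```python
# A minimal sketch of summarization-based context management in a multi-turn
# tool-use loop. `llm_generate`, `call_tool`, the message schema, and the
# character budget are illustrative assumptions, not details from the paper.

def run_agent(task, llm_generate, call_tool, max_rounds=32, context_budget=8000):
    context = [{"role": "user", "content": task}]
    for _ in range(max_rounds):
        action = llm_generate(context)              # dict with "final_answer" or "tool"/"arguments"
        if action.get("final_answer") is not None:
            return action["final_answer"]
        observation = call_tool(action["tool"], action["arguments"])
        context.append({"role": "assistant", "content": str(action)})
        context.append({"role": "tool", "content": str(observation)})

        # Periodic compression: once the working context exceeds the budget,
        # replace the accumulated tool-use history with an LLM-written summary
        # that keeps only task-relevant information, so the context stays
        # compact and the rollout can extend beyond a fixed context window.
        if sum(len(m["content"]) for m in context) > context_budget:
            summary = llm_generate(context + [{
                "role": "user",
                "content": "Summarize the interaction so far, keeping only task-relevant facts.",
            }])["text"]
            context = [
                {"role": "user", "content": task},
                {"role": "assistant", "content": "Summary of earlier turns: " + summary},
            ]
    return None
```

Per the abstract, the tool calls and the summaries are produced by the same policy, so the summarization step above would itself be optimized by the end-to-end policy gradient.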
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Africa > Kenya > Nairobi City County > Nairobi (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (5 more...)
- Workflow (1.00)
- Overview (0.92)
- Research Report > New Finding (0.34)
- Banking & Finance (0.94)
- Education (0.68)
- Government > Regional Government > Asia Government (0.46)
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP
Issaka, Sheriff, Wang, Keyi, Ajibola, Yinka, Samuel-Ipaye, Oluwatumininu, Zhang, Zhaoyi, Jimenez, Nicte Aguillon, Agyei, Evans Kofi, Lin, Abraham, Ramachandran, Rohan, Mumin, Sadick Abdul, Nchifor, Faith, Shuraim, Mohammed, Liu, Lieqi, Gonzalez, Erick Rosas, Kpei, Sylvester, Osei, Jemimah, Ajeneza, Carlene, Boateng, Persis, Yeboah, Prisca Adwoa Dufie, Gabriel, Saadia
Despite representing nearly one-third of the world's languages, African languages remain critically underserved by modern NLP technologies, with 88\% classified as severely underrepresented or completely ignored in computational linguistics. We present the African Languages Lab (All Lab), a comprehensive research initiative that addresses this technological gap through systematic data collection, model development, and capacity building. Our contributions include: (1) a quality-controlled data collection pipeline, yielding the largest validated African multi-modal speech and text dataset spanning 40 languages with 19 billion tokens of monolingual text and 12,628 hours of aligned speech data; (2) extensive experimental validation demonstrating that our dataset, combined with fine-tuning, achieves substantial improvements over baseline models, averaging +23.69 ChrF++, +0.33 COMET, and +15.34 BLEU points across 31 evaluated languages; and (3) a structured research program that has successfully mentored fifteen early-career researchers, establishing sustainable local capacity. Our comparative evaluation against Google Translate reveals competitive performance in several languages while identifying areas that require continued development.
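The reported gains are stated in ChrF++, COMET, and BLEU points. As a point of reference, the sketch below shows how corpus-level BLEU and ChrF++ are commonly computed with the sacrebleu library; the hypothesis and reference strings are placeholders, not data from the paper, and COMET (a learned metric requiring a pretrained model) is omitted.

```python
# Illustration of computing corpus-level BLEU and ChrF++ with sacrebleu.
# The hypothesis/reference pairs below are placeholders, not data from the paper.
import sacrebleu

hypotheses = ["the model translated this sentence", "another system output"]
references = [["the model translated this sentence well", "another reference output"]]
# references: one list per reference set, aligned index-by-index with the hypotheses.

bleu = sacrebleu.corpus_bleu(hypotheses, references)
# ChrF++ is the character n-gram F-score with word n-grams included (word_order=2).
chrfpp = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)

print(f"BLEU   = {bleu.score:.2f}")
print(f"ChrF++ = {chrfpp.score:.2f}")
```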
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Jordan (0.04)
- (29 more...)
- Information Technology (0.67)
- Education (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Who will be the next Pope? AI predicts the new head of the Roman Catholic Church after Pope Francis dies
Following the death of Pope Francis at the age of 88, the Catholic Church must now begin the lengthy process of electing his successor. Starting at least 15 days after his death, the 135 eligible cardinals will be locked away in the legendary Conclave until they have chosen the next pope. But if you just can't wait for the world's most secretive election to run its course, MailOnline has used AI to predict the result. According to OpenAI's ChatGPT, the man set to become the next head of the Roman Catholic Church is Cardinal Pietro Parolin. As the AI points out, the 70-year-old Italian priest is seen by many as the natural heir to Pope Francis' legacy and holds an edge in current betting markets. ChatGPT said: 'As Vatican Secretary of State since 2013, Parolin is viewed as the "continuity" candidate - acceptable to both reformers and traditionalists.
- Asia > China (0.15)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.06)
- Europe > Holy See > Vatican City (0.05)
- Africa > Ghana > Central Region > Cape Coast (0.05)
Explainable artificial intelligence (XAI): from inherent explainability to large language models
Mumuni, Fuseini, Mumuni, Alhassan
Artificial Intelligence (AI) has continued to achieve tremendous success in recent times. However, the decision logic of these frameworks is often not transparent, making it difficult for stakeholders to understand, interpret or explain their behavior. This limitation hinders trust in machine learning systems and causes a general reluctance towards their adoption in practical applications, particularly in mission-critical domains like healthcare and autonomous driving. Explainable AI (XAI) techniques facilitate the explainability or interpretability of machine learning models, enabling users to discern the basis of the decision and possibly avert undesirable behavior. This comprehensive survey details the advancements of explainable AI methods, from inherently interpretable models to modern approaches for achieving interpretability of various black box models, including large language models (LLMs). Additionally, we review explainable AI techniques that leverage LLM and vision-language model (VLM) frameworks to automate or improve the explainability of other machine learning models. The use of LLMs and VLMs as interpretability methods particularly enables high-level, semantically meaningful explanations of model decisions and behavior. Throughout the paper, we highlight the scientific principles, strengths and weaknesses of state-of-the-art methods and outline different areas of improvement. Where appropriate, we also present qualitative and quantitative comparisons of various methods. Finally, we discuss the key challenges of XAI and directions for future research.
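As a small, concrete instance of the post-hoc, model-agnostic family of methods the survey covers, the sketch below computes permutation feature importance with scikit-learn on a standard dataset; it is illustrative only and not code from the paper.

```python
# Minimal example of a post-hoc, model-agnostic explanation: permutation
# feature importance with scikit-learn. One simple technique from the broader
# family surveyed; not code from the paper itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in held-out accuracy:
# large drops indicate features the "black box" relies on.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.4f}")
```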
- Africa > Ghana > Central Region > Cape Coast (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Africa > Ghana > Western Region > Tarkwa (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Promising Solution (0.87)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Security & Privacy (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset
Olatunji, Tobi, Nimo, Charles, Owodunni, Abraham, Abdullahi, Tassallah, Ayodele, Emmanuel, Sanni, Mardhiyah, Aka, Chinemelu, Omofoye, Folafunmi, Yuehgoh, Foutse, Faniran, Timothy, Dossou, Bonaventure F. P., Yekini, Moshood, Kemp, Jonas, Heller, Katherine, Omeke, Jude Chidubem, MD, Chidi Asuzu, Etori, Naome A., Ndiaye, Aimérou, Okoh, Ifeoma, Ocansey, Evans Doe, Kinara, Wendy, Best, Michael, Essa, Irfan, Moore, Stephen Edward, Fourie, Chris, Asiedu, Mercy Nyamewaa
Recent advancements in large language model (LLM) performance on medical multiple-choice question (MCQ) benchmarks have stimulated interest from healthcare providers and patients globally. Particularly in low- and middle-income countries (LMICs) facing acute physician shortages and a lack of specialists, LLMs offer a potentially scalable pathway to enhance healthcare access and reduce costs. However, their effectiveness in the Global South, especially across the African continent, remains to be established. In this work, we introduce AfriMed-QA, the first large-scale Pan-African English multi-specialty medical question-answering (QA) dataset, comprising 15,000 questions (open and closed-ended) sourced from over 60 medical schools across 16 countries and covering 32 medical specialties. We further evaluate 30 LLMs across multiple axes, including correctness and demographic bias. Our findings show significant performance variation across specialties and geographies, and MCQ performance clearly lags behind USMLE (MedQA). We find that biomedical LLMs underperform general models, and that smaller edge-friendly LLMs struggle to achieve a passing score. Interestingly, human evaluations show a consistent consumer preference for LLM answers and explanations when compared with clinician answers.
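For the closed-ended portion of such a benchmark, scoring reduces to checking a model's selected option against the gold answer. The sketch below shows one way such an MCQ scorer might look; the record fields and the `ask_model` callable are hypothetical, since the dataset schema is not reproduced here.

```python
# Schematic MCQ accuracy scorer for a benchmark like AfriMed-QA. The record
# fields ("question", "options", "answer") and the `ask_model` callable are
# hypothetical placeholders; the actual dataset schema may differ.
import re

def mcq_accuracy(records, ask_model):
    correct = 0
    for rec in records:
        prompt = (
            rec["question"] + "\n"
            + "\n".join(f"{letter}. {text}" for letter, text in rec["options"].items())
            + "\nAnswer with a single letter."
        )
        reply = ask_model(prompt)
        match = re.search(r"\b([A-E])\b", reply.upper())  # extract the chosen option letter
        if match and match.group(1) == rec["answer"]:
            correct += 1
    return correct / len(records) if records else 0.0
```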
- Africa > South Africa (0.04)
- Africa > Nigeria (0.04)
- Africa > Malawi (0.04)
- (18 more...)
Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches
Mumuni, Alhassan, Mumuni, Fuseini
Generative artificial intelligence (AI) systems based on large-scale pretrained foundation models (PFMs) such as vision-language models, large language models (LLMs), diffusion models and vision-language-action (VLA) models have demonstrated the ability to solve complex and truly non-trivial AI problems in a wide variety of domains and contexts. Multimodal large language models (MLLMs), in particular, learn from vast and diverse data sources, allowing rich and nuanced representations of the world and thereby providing extensive capabilities, including the ability to reason; engage in meaningful dialog; collaborate with humans and other agents to jointly solve complex problems; and understand social and emotional aspects of humans. Despite this impressive feat, the cognitive abilities of state-of-the-art LLMs trained on large-scale datasets are still superficial and brittle. Consequently, generic LLMs are severely limited in their generalist capabilities. A number of foundational problems -- embodiment, symbol grounding, causality and memory -- must be addressed for LLMs to attain human-level general intelligence. These concepts are more aligned with human cognition and provide LLMs with inherent human-like cognitive properties that support the realization of physically plausible, semantically meaningful, flexible and more generalizable knowledge and intelligence. In this work, we discuss the aforementioned foundational issues and survey state-of-the-art approaches for implementing these concepts in LLMs. Specifically, we discuss how the principles of embodiment, symbol grounding, causality and memory can be leveraged toward the attainment of artificial general intelligence (AGI) in an organic manner.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Africa > Ghana > Central Region > Cape Coast (0.04)
- North America > United States > Virginia (0.04)
- (9 more...)
- Overview (1.00)
- Research Report > Promising Solution (0.65)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- (2 more...)
Unveiling Topological Structures in Text: A Comprehensive Survey of Topological Data Analysis Applications in NLP
The surge of data available on the internet has led to the adoption of various computational methods to analyze and extract valuable insights from this wealth of information. Among these, the field of Machine Learning (ML) has thrived by leveraging data to extract meaningful insights. However, ML techniques face notable challenges when dealing with real-world data, often due to issues of imbalance, noise, insufficient labeling, and high dimensionality. To address these limitations, some researchers advocate for the adoption of Topological Data Analysis (TDA), a statistical approach that discerningly captures the intrinsic shape of data despite noise. Despite its potential, TDA has not gained as much traction within the Natural Language Processing (NLP) domain as it has in structurally distinct areas like computer vision. Nevertheless, a dedicated community of researchers has been exploring the application of TDA in NLP, yielding 87 papers that we comprehensively survey in this paper. Our findings categorize these efforts into theoretical and non-theoretical approaches. Theoretical approaches aim to explain linguistic phenomena from a topological viewpoint, while non-theoretical approaches merge TDA with ML features, utilizing diverse numerical representation techniques. We conclude by exploring the challenges and unresolved questions that persist in this niche field. Resources and a list of papers on this topic can be found at: https://github.com/AdaUchendu/AwesomeTDA4NLP.
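A typical non-theoretical pipeline of the kind the survey categorizes converts text into numerical vectors and then extracts topological features from the resulting point cloud. The sketch below uses the ripser library on random placeholder embeddings; it illustrates the general recipe, not any specific surveyed method.

```python
# Sketch of extracting topological features from text representations:
# compute persistence diagrams (H0 and H1) from a point cloud of embeddings.
# The random vectors stand in for real sentence embeddings; ripser is one of
# several TDA libraries (gudhi, giotto-tda are alternatives) used in such work.
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 50))   # placeholder for 100 sentence embeddings

diagrams = ripser(embeddings, maxdim=1)["dgms"]
h0, h1 = diagrams[0], diagrams[1]

# Simple summary features that can be fed to a downstream ML model:
finite_h0 = h0[np.isfinite(h0[:, 1])]
print("H0 features (connected components):", len(h0))
print("H1 features (loops):", len(h1))
print("Mean H0 persistence:", float((finite_h0[:, 1] - finite_h0[:, 0]).mean()))
```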
- Oceania > Australia (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Missouri > Greene County > Springfield (0.04)
- (13 more...)
- Overview (1.00)
- Research Report > New Finding (0.34)
- Government (0.93)
- Information Technology > Security & Privacy (0.69)
- Health & Medicine > Therapeutic Area (0.46)
Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO
Mumuni, Fuseini, Mumuni, Alhassan
Grounding DINO and the Segment Anything Model (SAM) have achieved impressive performance in zero-shot object detection and image segmentation, respectively. Together, they have great potential to revolutionize applications in zero-shot semantic segmentation and data annotation. Yet, in specialized domains like medical image segmentation, objects of interest (e.g., organs, tissues, and tumors) may not fall within existing class names. To address this problem, the referring expression comprehension (REC) ability of Grounding DINO can be leveraged to detect arbitrary targets by their language descriptions. However, recent studies have highlighted a severe limitation of the REC framework in this application setting, owing to its tendency to make false positive predictions when the target is absent from the given image. While this bottleneck is central to the prospect of open-set semantic segmentation, it is still largely unknown how much improvement can be achieved by studying the prediction errors. To this end, we perform empirical studies on six publicly available datasets across different domains and reveal that these errors consistently follow a predictable pattern and can thus be mitigated by a simple strategy. Specifically, we show that false positive detections with appreciable confidence scores generally occupy large image areas and can usually be filtered by their relative sizes. More importantly, we expect these observations to inspire future research on improving REC-based detection and automated segmentation. Meanwhile, we evaluate the performance of SAM on multiple datasets from various specialized domains and report significant improvements in segmentation performance and annotation time savings over manual approaches.
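The filtering strategy described above (dropping detections whose boxes cover an implausibly large fraction of the image) can be written down directly. In the sketch below, the box format and the area-ratio and score thresholds are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch of the relative-size filter described in the abstract:
# detections whose boxes cover too large a fraction of the image are treated
# as likely false positives and dropped. The (x1, y1, x2, y2) box format and
# the 0.5 / 0.25 thresholds are illustrative assumptions, not the paper's values.

def filter_by_relative_size(boxes, scores, image_width, image_height,
                            max_area_ratio=0.5, min_score=0.25):
    image_area = float(image_width * image_height)
    kept = []
    for (x1, y1, x2, y2), score in zip(boxes, scores):
        if score < min_score:
            continue                      # discard low-confidence detections outright
        area_ratio = max(0.0, x2 - x1) * max(0.0, y2 - y1) / image_area
        if area_ratio <= max_area_ratio:  # keep only plausibly sized detections
            kept.append(((x1, y1, x2, y2), score))
    return kept
```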
- Health & Medicine > Diagnostic Medicine > Imaging (0.68)
- Information Technology > Security & Privacy (0.67)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Automated data processing and feature engineering for deep learning and big data applications: a survey
Mumuni, Alhassan, Mumuni, Fuseini
The modern approach to artificial intelligence (AI) aims to design algorithms that learn directly from data. This approach has achieved impressive results and has contributed significantly to the progress of AI, particularly in the sphere of supervised deep learning. It has also simplified the design of machine learning systems as the learning process is highly automated. However, not all data processing tasks in conventional deep learning pipelines have been automated. In most cases, data must be manually collected, preprocessed and further extended through data augmentation before it can be used effectively for training. Recently, special techniques for automating these tasks have emerged. The automation of data processing tasks is driven by the need to utilize large volumes of complex, heterogeneous data for machine learning and big data applications. Today, end-to-end automated data processing systems based on automated machine learning (AutoML) techniques are capable of taking raw data and transforming them into useful features for big data tasks by automating all intermediate processing stages. In this work, we present a thorough review of approaches for automating data processing tasks in deep learning pipelines, including automated data preprocessing--e.g., data cleaning, labeling, missing data imputation, and categorical data encoding--as well as data augmentation (including synthetic data generation using generative AI methods) and feature engineering--specifically, automated feature extraction, feature construction and feature selection. In addition to automating specific data processing tasks, we discuss the use of AutoML methods and tools to simultaneously optimize all stages of the machine learning pipeline.
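As a simplified, concrete example of chaining the preprocessing steps discussed here (missing-data imputation, categorical encoding, and feature selection), the sketch below uses a scikit-learn pipeline; the column names and toy data are placeholders, and full AutoML systems would additionally search over these choices automatically.

```python
# Simplified example of chaining automated preprocessing steps (missing-data
# imputation, categorical encoding, feature selection) in a scikit-learn
# pipeline. Column names and toy data are illustrative placeholders.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

numeric_cols, categorical_cols = ["age", "income"], ["region"]

preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("select", SelectKBest(f_classif, k=3)),        # automated feature selection
    ("classify", LogisticRegression(max_iter=1000)),
])

df = pd.DataFrame({
    "age": [25, np.nan, 40, 33],
    "income": [30_000, 52_000, np.nan, 61_000],
    "region": ["north", "south", np.nan, "north"],
})
labels = [0, 1, 1, 0]

model.fit(df, labels)
print(model.predict(df))
```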
- Africa > Ghana > Central Region > Cape Coast (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- Oceania > New Zealand > North Island > Waikato (0.04)
- (5 more...)
- Overview (1.00)
- Research Report > New Finding (0.45)
- Research Report > Experimental Study (0.45)
- Information Technology > Software (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- (4 more...)
A survey of synthetic data augmentation methods in computer vision
Mumuni, Alhassan, Mumuni, Fuseini, Gerrar, Nana Kobina
The standard approach to tackling computer vision problems is to train deep convolutional neural network (CNN) models using large-scale image datasets that are representative of the target task. However, in many scenarios, it is often challenging to obtain sufficient image data for the target task. Data augmentation is a way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways so as to create the required volume and variability of training data necessary to achieve good generalization performance. In situations where data for the target domain is not accessible, a viable workaround is to synthesize training data from scratch--i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modeling, neural style transfer (NST), differential neural rendering, and generative artificial intelligence (AI) techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, general scope of application and specific use-cases, as well as existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting the main features, application domains and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we hope to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.
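To make the generative route concrete, the sketch below defines a toy GAN-style generator that maps random latent vectors to image-shaped tensors. The network is untrained, so its outputs are noise; the point is only the sampling interface by which synthetic training images would be drawn once such a generator had been trained.

```python
# Toy illustration of the generative route to data augmentation: a small
# GAN-style generator maps random latent vectors to image-shaped tensors.
# The network is untrained (its outputs are noise); only the sampling
# interface for synthesizing training data from scratch is shown.
import torch
from torch import nn

latent_dim = 64

generator = nn.Sequential(
    nn.Linear(latent_dim, 256),
    nn.ReLU(),
    nn.Linear(256, 3 * 32 * 32),   # 3-channel 32x32 synthetic images
    nn.Tanh(),
)

def sample_synthetic_images(n):
    z = torch.randn(n, latent_dim)            # latent noise
    with torch.no_grad():
        images = generator(z).view(n, 3, 32, 32)
    return images                             # pixel values in [-1, 1]

batch = sample_synthetic_images(16)
print(batch.shape)  # torch.Size([16, 3, 32, 32])
```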
- Africa > Ghana > Central Region > Cape Coast (0.04)
- North America > United States > Kansas > Sheridan County (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (6 more...)
- Research Report (1.00)
- Overview (1.00)
- Information Technology (1.00)
- Leisure & Entertainment > Games > Computer Games (0.93)
- Health & Medicine > Diagnostic Medicine > Imaging (0.93)
- (2 more...)